Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autosplitter for big generated java methods #10367

Closed
wants to merge 2 commits into from

Conversation

Okapist
Copy link
Contributor

@Okapist Okapist commented Aug 5, 2022

Fix for #10247

All unit tests ok with very small kMaxFieldsInMethod=4

Round trip benchmark

ByteArrayOutputStream baos = new ByteArrayOutputStream();
protoOld.writeTo(baos);

orig.protobuf_unittest.TestAnyProto.TestAny testAny =
        orig.protobuf_unittest.TestAnyProto.TestAny.parseFrom(baos.toByteArray());
blackhole.consume(testAny);


java -version
openjdk version "17.0.3" 2022-04-19
OpenJDK Runtime Environment Temurin-17.0.3+7 (build 17.0.3+7)
OpenJDK 64-Bit Server VM Temurin-17.0.3+7 (build 17.0.3+7, mixed mode, sharing)

Check small proto with 15 field. All fields inited

Benchmark      Mode  Cnt    Score   Error  Units
Test.checkNew  avgt    2  836,389          ns/op
Test.checkOld  avgt    2  933,154          ns/op

Check small proto with 15 field. 2 fields inited, 14 empty

Benchmark                    Mode  Cnt    Score   Error  Units
Test.checkNew15fields13empy  avgt    2  251,938          ns/op
Test.checkOld15fields13empy  avgt    2  334,046          ns/op

Check mid size proto. Here split code work and JIT work on old and on new code.

Benchmark              Mode  Cnt     Score   Error  Units
Test.checkNew70fields  avgt    2  1637,487          ns/op
Test.checkOld70fields  avgt    2  1621,035          ns/op

Check big proto. Around 900 fields. Here split code work and JIT not work on old code.

Benchmark                    Mode  Cnt      Score   Error  Units
TestVisit.checkNew800fields  avgt    2   6239,123          ns/op
TestVisit.checkOld800fields  avgt    2  36872,855          ns/op

I see no difference or slight increase speed in small or medium proto files and great speed increase in large proto files.

@google-cla
Copy link

google-cla bot commented Aug 5, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@Okapist Okapist changed the title Protocolbuffers main Autosplitter for big generated methods Aug 5, 2022
@Okapist Okapist changed the title Autosplitter for big generated methods Autosplitter for big generated java methods Aug 5, 2022
@fowles fowles requested a review from googleberg August 5, 2022 17:16
@fowles
Copy link
Contributor

fowles commented Aug 5, 2022

We will need you to sign the CLA before we can review this change.

Signed-off-by: Okapist <anonbk@gmail.com>
@Okapist
Copy link
Contributor Author

Okapist commented Aug 6, 2022

We will need you to sign the CLA before we can review this change.

CLA signed

@fowles
Copy link
Contributor

fowles commented Aug 7, 2022

Just a warning that the best person to review this is out for a bit, so you should expect silence for at least a week

@googleberg
Copy link
Member

googleberg commented Aug 18, 2022 via email

@googleberg
Copy link
Member

@Okapist these are some intriguing results. Thank you for trying this out!

Before we go down this path of introducing complexity through splitting and continue to expand the size of generated message code, I'd like us to try comparing the performance using the Lite approach to these methods.

See GeneratedMessageLite.hashCode, .equals, .writeTo which use the Schema utility to implement these methods.

Would you be willing to take a crack at that for comparison? I have a pressing project that will likely delay me from investigating that approach myself. The goal is to balance performance with the maintenance load from having so many approaches to the problem.

@Okapist
Copy link
Contributor Author

Okapist commented Aug 23, 2022

Yep.
No changes. It's generates the absolutely same java code for MessageLite. There no big methods generation in lite approach.

Performance check show no difference as expected.

Benchmark                     Mode  Cnt    Score   Error  Units
Test.checkNew70fieldsWriteTo  avgt    2  810,307          ns/op
Test.checkOld70fieldsWriteTo  avgt    2  835,257          ns/op

Benchmark                       Mode  Cnt     Score   Error  Units
Test.checkNew70fieldsRoundtrip  avgt    2  1842,443          ns/op
Test.checkOld70fieldsRoundtrip  avgt    2  1968,669          ns/op

Benchmark                    Mode  Cnt  Score   Error  Units
Test.checkNew70fieldsEquals  avgt    2  0,822          ns/op
Test.checkOld70fieldsEquals  avgt    2  0,889          ns/op

@googleberg
Copy link
Member

@Okapist Thanks for running this!

If I understand, these are results for running the other methods with the split in place. Which will be useful for the comparison I really want to run which is using the Lite implementation of these methods vs current impl and split impl.

We could then decide if we should introduce splitting or use the Lite implementation.

@Okapist
Copy link
Contributor Author

Okapist commented Aug 29, 2022

Lite implementation is slow. Usually we use serialize and deserialize only. Lite is 20-30% slower than normal.
Use normal implementation for all cases except big proto is frustrating and hard to support.

Copy link
Member

@googleberg googleberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we proceed with this implementation, what is the reason for incrementally deciding to split vs just looking at the number of fields up front and branching within the top-level method?

@@ -1120,6 +1122,22 @@ void EscapeUtf16ToString(uint16_t code, std::string* output) {
}
}

void MaybeSplitJavaMethod(io::Printer *printer, int* fields_in_function, int* method_num,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we MaybeSplitJavaMethod? Can't we know ahead of time (by the number of fields) how we need to split the top-level method?

@ljw-hit
Copy link

ljw-hit commented Dec 6, 2023

@Okapist @googleberg Hi, I would like to know the current status of this PR. Has the ability of autosplitter been integrated? I think this feature is still very important. Big proto will seriously affect the performance of serialization and deserialization in Java. Even if you encounter a proto with too many lines and the generated java method exceeds 64k, it cannot be used at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants